Next: Polysemy Up: Computational Issues Previous: Introduction

What we want from a Computational Lexicon

  The lexicon is the starting point of any NLP system. It must contain information about every potential word form which the system might come across, in order to guide processing. There are several levels of sophistication of (potential) NLP systems, and each of these levels requires lexical information of varying detail. I will attempt to identify different NLP tasks, their lexical needs, and the impact of ambiguity on them. I will consider the tasks of shallow and deep parsing, information retrieval, machine translation, natural language understanding, and natural language generation.

The previous discussion has highlighted the differing needs for lexical representation among various NLP systems. This representation can range from a very shallow list of morphological forms to a highly structured and fine-grained lexicon derived from linguistic theory. I emphasised that the choice of lexical representation and structure is task-dependent: the greater the level of understanding required of an NLP system, the more important the issues stemming from polysemy become and the more attention must be paid to the lexicon. I will review the problems underlying the representation of polysemy in Section 6.3, and will consider existing approaches to word sense disambiguation in Section 6.4.

The design of any computational lexicon must, however, also consider how the lexicon can be built up and how it can be modified and updated when necessary. I will therefore discuss the problem of lexical acquisition in Section 6.5.
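
The range of lexical representations described above, from a shallow list of morphological forms to a structured lexicon that records multiple senses, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names SHALLOW_LEXICON, Sense, LexicalEntry, STRUCTURED_LEXICON and candidate_senses are hypothetical, and the sample entry for "newspaper" is invented for the example rather than taken from any existing lexicon or system.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Shallow end of the range: a flat map from surface word forms to a base form.
# Illustrative entries only; enough for tasks needing no sense information.
SHALLOW_LEXICON: Dict[str, str] = {
    "newspaper": "newspaper",
    "newspapers": "newspaper",
}


@dataclass
class Sense:
    """One sense of a lexical entry (hypothetical structure)."""
    gloss: str
    category: str  # part of speech


@dataclass
class LexicalEntry:
    """Structured end of the range: a lemma with its forms and senses."""
    lemma: str
    forms: List[str]
    senses: List[Sense] = field(default_factory=list)

    def is_polysemous(self) -> bool:
        # An entry is treated as polysemous here if it lists more than one sense.
        return len(self.senses) > 1


STRUCTURED_LEXICON: Dict[str, LexicalEntry] = {
    "newspaper": LexicalEntry(
        lemma="newspaper",
        forms=["newspaper", "newspapers"],
        senses=[
            Sense("printed publication (physical object)", "noun"),
            Sense("organisation that publishes it", "noun"),
        ],
    ),
}


def candidate_senses(word_form: str) -> List[Sense]:
    """Return every sense a word form could carry; empty if the form is unknown."""
    lemma = SHALLOW_LEXICON.get(word_form)
    if lemma is None:
        return []
    entry = STRUCTURED_LEXICON.get(lemma)
    return entry.senses if entry is not None else []
```

A system that only needs the shallow map can ignore the ambiguity entirely, whereas one that calls candidate_senses("newspapers") receives two candidate senses and must choose between them, which is where word sense disambiguation becomes necessary.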

